See previous examples for typical problems in a data analytics context.
Vast variety of potential bottlenecks, so it is hard to give general advice.
Programming with Big Data
Which basic (already implemented) R functions are more or less suitable as building blocks for the program?
How can we exploit/avoid some of R’s lower-level characteristics in order to implement efficient functions?
Is there a need to interface with a lower-level programming language in order to speed up the code? (advanced topic)
Independent of how we write a statistical procedure in R (or in any other language, for that matter), is there an alternative statistical procedure/algorithm that is faster but delivers approximately the same result?
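To illustrate the last question, here is a minimal sketch in which a regression slope is estimated once on all observations and once on a random subsample. The data are simulated, and the sample size, seed, and 1% sampling fraction are arbitrary assumptions made for illustration; this is one possible example of a cheaper approximate procedure, not a recommended general method.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 1e6                         # hypothetical number of observations
x <- rnorm(n)
y <- 2 + 0.5 * x + rnorm(n)      # simulated data; the true slope is 0.5

# Exact OLS slope of y on x, using all observations
beta_full <- cov(x, y) / var(x)

# Approximate slope, using a 1% random subsample of the rows
idx <- sample.int(n, size = n / 100)
beta_sub <- cov(x[idx], y[idx]) / var(x[idx])

# Compare the two estimates side by side
c(full = beta_full, subsample = beta_sub)
```

The subsample estimate touches only a fraction of the data, at the price of a larger sampling error. Whether that trade-off is acceptable depends on the application.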
Issues to keep in mind
Vectorization.
Memory: avoid copying, pre-allocate memory.
Use built-in primitive (C) functions (caution: not always faster, e.g. when the built-in function trades speed for numerical precision).
Existing solutions: load additional packages (e.g. data.table::fread() instead of read.csv()).
Focus of what follows in this course (approach taken in Walkowiak (2016)).
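The first two issues (vectorization and memory) can be sketched with three ways of computing the same result. The function names sqrt_grow, sqrt_prealloc, and sqrt_vec are hypothetical and introduced only for this illustration; the vector length is an arbitrary assumption. Timings depend on the machine, so none are stated here.

```r
n <- 1e4                         # hypothetical vector length
x <- runif(n)

# Growing the result inside the loop: each call to c() copies the vector
sqrt_grow <- function(x) {
  out <- numeric(0)
  for (xi in x) out <- c(out, sqrt(xi))
  out
}

# Pre-allocating the result: memory is reserved once, then filled in place
sqrt_prealloc <- function(x) {
  out <- numeric(length(x))
  for (i in seq_along(x)) out[i] <- sqrt(x[i])
  out
}

# Vectorized: sqrt() operates on the whole vector, and the loop runs in C
sqrt_vec <- function(x) sqrt(x)

# All three variants should return the same values
stopifnot(isTRUE(all.equal(sqrt_grow(x), sqrt_vec(x))))
stopifnot(isTRUE(all.equal(sqrt_prealloc(x), sqrt_vec(x))))

# Compare elapsed times on your own machine
system.time(sqrt_grow(x))
system.time(sqrt_prealloc(x))
system.time(sqrt_vec(x))
```

On most setups one would expect the growing version to be the slowest and the vectorized version the fastest, but it is worth measuring this rather than assuming it.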